feat(core, anthropic, openai): aclose() lifecycle + latched fallbacks #37718

Draft
Bagatur (baskaryan) wants to merge 3 commits into master from feat/chat-model-aclose

@baskaryan Bagatur (baskaryan) commented May 27, 2026

Summary

Adds an explicit resource-lifecycle contract to chat models, an opt-in FallbackLatch circuit-breaker for with_fallbacks(...), and close()/aclose() propagation through RunnableWithFallbacks.

langchain-core

  • BaseChatModel.close() / aclose() — default no-ops that subclasses override. aclose() dispatches to close() so async teardown works for sync-only subclasses. Adds __enter__/__exit__/__aenter__/__aexit__ for context-manager use.
  • RunnableWithFallbacks.close() / aclose() — walk runnable + fallbacks, calling each one's lifecycle method; per-runnable failures are suppressed so one bad close doesn't block the rest.
  • FallbackLatch + with_fallbacks(..., latch=...) — opt-in circuit breaker. Once the primary raises a handled exception the latch trips and subsequent calls (on this wrapper, or any wrapper sharing the latch) skip the primary. The latch propagates through __getattr__ rebinds (bind_tools, bind, …) so tool-bound and bare wrappers share one circuit. latch.reset() re-enables the primary. Default (no latch) behavior is unchanged.
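The lifecycle contract and the suppress-and-continue propagation described above can be sketched without any LangChain imports. This is a minimal illustration, not the PR's implementation: `SketchChatModel`, `BrokenModel`, `SketchFallbacks`, and `demo` are hypothetical names, and the `closed` flag exists only so the sketch can show that teardown ran (the real defaults are no-ops).

```python
import asyncio
import contextlib


class SketchChatModel:
    """Hypothetical stand-in for a chat model (not the real BaseChatModel)."""

    def __init__(self) -> None:
        # The PR's defaults are no-ops; this flag is only for observability.
        self.closed = False

    def close(self) -> None:
        self.closed = True

    async def aclose(self) -> None:
        # Delegate to close() so sync-only subclasses still support
        # `await model.aclose()`.
        self.close()

    def __enter__(self) -> "SketchChatModel":
        return self

    def __exit__(self, *exc_info: object) -> None:
        self.close()

    async def __aenter__(self) -> "SketchChatModel":
        return self

    async def __aexit__(self, *exc_info: object) -> None:
        await self.aclose()


class BrokenModel(SketchChatModel):
    """A model whose teardown always fails, to exercise suppression."""

    def close(self) -> None:
        raise RuntimeError("close failed")


class SketchFallbacks:
    """Hypothetical wrapper that walks the primary and its fallbacks."""

    def __init__(self, runnable, fallbacks) -> None:
        self.runnable = runnable
        self.fallbacks = list(fallbacks)

    async def aclose(self) -> None:
        for candidate in [self.runnable, *self.fallbacks]:
            # One failing close must not prevent the remaining closes.
            with contextlib.suppress(Exception):
                await candidate.aclose()


async def demo() -> tuple:
    async with SketchChatModel() as model:
        pass  # a real caller would invoke the model here
    backup = SketchChatModel()
    await SketchFallbacks(BrokenModel(), [backup]).aclose()
    return model.closed, backup.closed
```

Calling `asyncio.run(demo())` exits the `async with` block through `aclose()`, and the backup is still closed even though the primary's `close()` raises.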

langchain-anthropic / langchain-openai

close()/aclose() release the underlying httpx client only when the model privately owns it.

This is the important subtlety: both integrations back their SDK clients with a process-wide shared httpx pool via @lru_cache (_get_default_*httpx_client). Every model with the same base_url/timeout/proxy reuses one pool, by design. An earlier revision of this PR closed that shared pool on teardown, which broke every other live model in the process:

RuntimeError: Cannot send a request, as the client has been closed.
-> anthropic.APIConnectionError: Connection error.

So ownership is now tracked explicitly:

  • anthropic: ChatAnthropic always wraps the shared cached pool (no http_client field), so close()/aclose() are no-ops for the pool, guarded by an identity check against the lru-cache getter (defensive against any future private-client path).
  • openai: a client is owned iff the model built it privately — the unhashable-httpx.Timeout fresh-client path or an openai_proxy transport — and the user didn't supply their own http_client/http_async_client. Shared-cache and user-supplied clients are never closed.
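The ownership rule can be illustrated with a dependency-free sketch. Everything here is an assumption made for illustration: `FakeHttpClient` stands in for an httpx client, `get_shared_client` for the lru-cached getter, and the `private` flag for the private-client paths (unhashable timeout or proxy transport). The integrations' real fields and helpers differ.

```python
from functools import lru_cache
from typing import Optional


class FakeHttpClient:
    """Hypothetical stand-in for an httpx client."""

    def __init__(self) -> None:
        self.is_closed = False

    def close(self) -> None:
        self.is_closed = True


@lru_cache(maxsize=None)
def get_shared_client(base_url: str, timeout: float) -> FakeHttpClient:
    # One pool per distinct (base_url, timeout), reused process-wide.
    return FakeHttpClient()


class SketchModel:
    """Hypothetical model that records whether it owns its HTTP client."""

    def __init__(
        self,
        base_url: str,
        timeout: float = 60.0,
        http_client: Optional[FakeHttpClient] = None,
        private: bool = False,
    ) -> None:
        if http_client is not None:
            # User-supplied: the caller manages its lifetime.
            self.http_client = http_client
            self.owns_client = False
        elif private:
            # Built privately by this model, so this model may close it.
            self.http_client = FakeHttpClient()
            self.owns_client = True
        else:
            # Shared cached pool: never closed by a single model.
            self.http_client = get_shared_client(base_url, timeout)
            self.owns_client = False

    def close(self) -> None:
        if self.owns_client and not self.http_client.is_closed:
            self.http_client.close()
```

With this shape, closing one default model leaves the pool open for any sibling constructed with the same arguments, which is the invariant the regression tests pin.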

This makes aclose() safe to call after every use without disturbing sibling models or pools the caller owns. It is most useful for the privately-owned and user-injected-and-managed cases, and as a uniform, provider-agnostic teardown hook for frameworks (LangGraph runtimes, agent orchestrators) that previously had to reach into private SDK attributes.

Release Note

none

Test Plan

  • core: lifecycle + context-manager tests for BaseChatModel; latch trip/reset/shared/propagation + close/aclose propagation tests for RunnableWithFallbacks.
  • anthropic: regression test — two models share the lru-cached pool, close one, assert the other's pool stays open; plus owned-path + uninstantiated-noop tests.
  • openai: shared-pool regression test (sync + async), owned-via-proxy / owned-via-unhashable-timeout flag tests, user-injected-not-closed test, owned-client-closed test.
  • Full suites pass: core runnables + language_models, libs/partners/anthropic (106), libs/partners/openai chat_models/test_base (201). Lint + mypy clean.

🤖 Generated with Claude Code

Plumb an explicit resource-lifecycle contract through `BaseChatModel`,
`RunnableWithFallbacks`, and the two largest partner integrations.
Adds an opt-in `FallbackLatch` so `with_fallbacks(...)` can short-circuit
the primary after a failure.

Motivation: provider SDKs (`anthropic`, `openai`) back their clients with
httpx connection pools that the SDKs only release best-effort from
`__del__` — `asyncio.get_running_loop().create_task(self.aclose())` with
a bare `except Exception: pass`. Long-lived workers that construct
chat models per request (multi-tenant LangGraph deployments,
agents-as-services) silently accumulate pools and leak memory + file
descriptors. The fix today requires reaching into private attributes
(`_async_client`, `root_async_client`, ...) on each provider. This PR
makes teardown a first-class part of the chat-model API.

## langchain-core

- `BaseChatModel.close()` / `aclose()` — default no-ops that subclasses
  override. `aclose()` dispatches to `close()` so async teardown works
  for sync-only subclasses. Adds `__enter__`/`__exit__`/`__aenter__`/
  `__aexit__` so models can be used as context managers.
- `RunnableWithFallbacks.close()` / `aclose()` — walks `runnable` and
  `fallbacks`, calling each one's lifecycle method. Per-runnable failures
  are suppressed so one bad close doesn't prevent the others from
  running.
- `FallbackLatch` + `with_fallbacks(..., latch=...)` — opt-in
  circuit-breaker: once the primary raises a handled exception, the
  latch trips and subsequent calls (on this wrapper, or any wrapper
  sharing the same latch instance) skip the primary. Useful when a primary
  failure is unlikely to recover within the wrapper's lifetime — wrong
  API key, sustained outage — so the default
  `try-primary-on-every-call` doesn't waste a round-trip on every
  retry. `latch.reset()` re-enables the primary. The latch propagates
  through `__getattr__` rebinds (e.g. `wrapper.bind_tools([...])`) so
  tool-bound and bare wrappers share one circuit.
- Default behaviour is unchanged: passing no `latch` retains the
  existing "retry primary on every call" semantics.
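The latch semantics above can be sketched as a small circuit breaker around plain callables. This is an illustrative assumption, not the `FallbackLatch` implementation: `SketchLatch` and `call_with_fallbacks` are hypothetical names, and using `threading.Event` for the tripped flag is a choice made for this sketch.

```python
import threading


class SketchLatch:
    """Hypothetical latch: once tripped, callers skip the primary."""

    def __init__(self) -> None:
        self._tripped = threading.Event()

    @property
    def tripped(self) -> bool:
        return self._tripped.is_set()

    def trip(self) -> None:
        self._tripped.set()

    def reset(self) -> None:
        # Re-enable the primary for subsequent calls.
        self._tripped.clear()


def call_with_fallbacks(primary, fallbacks, arg, latch=None, handled=(Exception,)):
    """Try the primary, then each fallback; honour the latch if one is given."""
    skip_primary = latch is not None and latch.tripped
    candidates = list(fallbacks) if skip_primary else [primary, *fallbacks]
    last_error = None
    for fn in candidates:
        try:
            return fn(arg)
        except handled as exc:
            last_error = exc
            if latch is not None and fn is primary:
                # A handled primary failure trips the shared circuit.
                latch.trip()
    if last_error is None:
        raise RuntimeError("no candidates to call")
    raise last_error
```

Passing the same `SketchLatch` instance to several call sites gives them one shared circuit; omitting `latch` keeps the try-primary-on-every-call behaviour.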

## langchain-anthropic

- `ChatAnthropic.close()` / `aclose()` — closes `_client` (sync) and
  `_async_client` (async). Both are `cached_property` slots; guarded
  via `__dict__` so we don't materialize an uninstantiated cached
  client just to immediately close it. Idempotent.
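The `__dict__` guard mentioned above relies on `functools.cached_property` storing its computed value in the instance `__dict__` under the property's name. A minimal sketch of that guard alone follows; `SketchClient` and `SketchCachedModel` are hypothetical names, and the sketch deliberately ignores the question of who owns the underlying pool.

```python
from functools import cached_property


class SketchClient:
    """Hypothetical stand-in for an SDK client."""

    def __init__(self) -> None:
        self.is_closed = False
        self.close_calls = 0

    def close(self) -> None:
        self.close_calls += 1
        self.is_closed = True


class SketchCachedModel:
    """Hypothetical model with a lazily built client."""

    @cached_property
    def _client(self) -> SketchClient:
        # Built on first access, then cached in self.__dict__["_client"].
        return SketchClient()

    def close(self) -> None:
        # Look in __dict__ directly: accessing self._client would build
        # a client just to close it.
        client = self.__dict__.get("_client")
        if client is not None and not client.is_closed:
            client.close()
```

Calling `close()` on a model whose client was never used leaves `__dict__` untouched, and repeated calls after the first real close do nothing further.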

## langchain-openai

- `BaseChatOpenAI.close()` / `aclose()` — closes `root_client` and
  `root_async_client`, then clears the corresponding `client` /
  `async_client` attributes so the model can't be used after teardown.
  Idempotent. Tolerates the API-key-missing case where one client is
  `None`.

Note: `BaseChatOpenAI`'s eager construction of both sync + async
clients in its `model_validator` (even for async-only use) is a related
inefficiency but not addressed here — it's a fixed per-instance cost
rather than the per-request leak that `aclose()` solves.

## Tests

- 7 new latch + propagation tests in `test_fallbacks.py`
- 4 new lifecycle tests in `test_base.py` for `BaseChatModel`
- 5 new tests in `test_chat_models.py` for `ChatAnthropic`
- 5 new tests in `test_base.py` for `BaseChatOpenAI`

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added anthropic `langchain-anthropic` package issues & PRs core `langchain-core` package issues & PRs feature For PRs that implement a new feature; NOT A FEATURE REQUEST integration PR made that is related to a provider partner package integration internal openai `langchain-openai` package issues & PRs size: L 500-999 LOC labels May 27, 2026

codspeed-hq Bot commented May 27, 2026

Merging this PR will not alter performance

✅ 13 untouched benchmarks
⏩ 2 skipped benchmarks [1]

Comparing feat/chat-model-aclose (e0ace82) with master (1338871) [2]

Footnotes

  1. 2 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

  2. No successful run was found on master (4493b2c) during the generation of this report, so 1338871 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

Bagatur (baskaryan) and others added 2 commits May 27, 2026 16:42
mypy flags the `# type: ignore[override]` on the test subclasses'
`close()` / `aclose()` methods as unused — `BaseChatModel.close` /
`aclose` are concrete (non-abstract) defaults, so overriding them in a
subclass does not need an override suppression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…se()

The first cut of close()/aclose() unconditionally closed the SDK
client's underlying httpx pool. But both integrations back their
clients with a PROCESS-WIDE SHARED pool via @lru_cache
(`_get_default_*httpx_client`): every model with the same
base_url/timeout/proxy reuses one pool by design. Closing it from a
single model's teardown broke every other live model in the process —
observed in a long-lived worker as:

    RuntimeError: Cannot send a request, as the client has been closed.
    -> anthropic.APIConnectionError: Connection error.

Fix: close()/aclose() now release the underlying httpx client ONLY when
the model privately owns it; the shared cached pool and user-supplied
clients are left intact.

- anthropic: `ChatAnthropic` always wraps the shared cached pool (it has
  no http_client field), so close()/aclose() are effectively no-ops for
  the pool. An identity check against the lru-cache getter
  (`_wraps_shared_httpx`) guards a hypothetical future private-client
  path. `_http_client_params()` is factored out so the cached_property
  builders and the identity check stay in sync.
- openai: ownership is computed in `validate_environment` and stored on
  `_owns_sync_http_client` / `_owns_async_http_client`. A client is owned
  iff the model built it privately — the unhashable-`httpx.Timeout`
  fresh-client path or an `openai_proxy` transport — and the user did not
  supply their own `http_client` / `http_async_client`. Default (shared
  cache) and user-supplied clients are never closed.

Tests rewritten to pin the invariant: a regression test builds two
default models, closes one, and asserts the other's shared pool is still
open; plus owned-path and user-injected-not-closed cases.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>